Abstract

In this paper, we describe a general method for constructing the 
posterior distribution of an option price. Our framework takes as in- 
puts the prior distributions of the parameters of the stochastic process 
followed by the underlying, as well as the likelihood function implied 
by the observed price history for the underlying. Our work extends 
that of Karolyi (1993) and Darsinos and Satchell (2001), but with the 
crucial difference that the likelihood function we use for inference is 
that which is directly implied by the underlying, rather than imposed 
in an ad hoc manner via the introduction of a function representing 
"measurement error." As such, an important problem still relevant for 
our method is that of model risk, and we address this issue by de- 
scribing how to perform a Bayesian averaging of parameter inferences 
based on the different models considered using our framework. 

1 Introduction

In this article, we present a novel method for "integrating out" parameters 
from the risk-neutral pricing formula of an option. We are not the first 
to propose a method that uses this well-known Bayesian technique for the 
problem of option pricing. Previous innovations in the area of Bayesian 
econometrics on techniques for integrating out parameters after the risk- 
neutral pricing formula for the option has been derived in closed form include 
Eraker et al (2000). However, no work has been done, to our knowledge, on 
integrating out the parameters during the transformation of the physical 
measure P to the risk neutral measure Q, except when testing sequentially 
a precise hypothesis concerning the drift of a Brownian motion as in Rui 
(2002), or computing Bayes factors between different models as in Polson
and Roberts (1994).

Karolyi (1993) and Darsinos and Satchell (2001) are perhaps the previous
works most closely related to our own. In their articles, the authors use
the Bayesian technique of integrating out parameters to derive the posterior 
distribution in closed-form for a European call option when the underlying 
follows a geometric Brownian motion. To do this, however, they use the 
sufficiency property of the unbiased estimator of $\sigma^2$ from discretely sampled
observations. The method we propose differs crucially in this respect: to 
obtain the posterior of the option price, we avoid what may be considered 
an inconsistency in previous methods, by using the likelihood implied by the 
stochastic process of the underlying. This is important on practical grounds, 
because the different likelihoods, except in a special case, will lead to different 
posterior distributions for the parameters, and this affects inference. 

In general, the methodology developed in this paper is able to extend and 
yield the posterior distribution, in closed-form, as in the work of Karolyi 
(1993) and Darsinos and Satchell (2001). We construct these posterior dis- 
tributions by combining the likelihood function that is implied by the under- 
lying stochastic process with the prior distributions that are specified as the 
views of the market participant. 

The method described in the remainder of our paper serves as an illustration
of the linkages to be made between the broad areas of mathematical
finance, on one hand, and Bayesian probability, on the other. We take a con- 
tinuous time point of view for the purposes of exposition, but the technique 
can be formulated in discrete time and implemented for practical purposes. 
This is a subject of future research. 

The outline of this paper is as follows. Section 2 reviews the literature on 
Bayesian option pricing and motivates our framework. Section 3 presents the 
methodology and describes how to find a likelihood function ($dQ/dP$, the
Radon-Nikodym derivative of the risk-neutral measure $Q$ with respect to the
physical measure $P$) by the use of the Esscher transform. It also addresses
the question of how to choose acceptable prior distributions for
continuous-time finance model parameters in order to perform a Bayesian
analysis. Section 4 illustrates the methodology with two examples pertaining
to the classical Black and Scholes model, as well as a general diffusion
case. Section 5 concludes.

2 Previous work on Bayesian option pricing 

The mainstream Bayesian literature has concerned itself with using state- 
space models as a way to get posterior distributions for derivatives perturbed 
around a Black & Scholes price of the following sort:

$$C_t = BS(\sigma, S_t) + \varepsilon_t \qquad (1)$$
$$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t \qquad (2)$$

where $W_t$ is a Brownian motion, the error term $\varepsilon_t \sim N(0, v^2)$, and $BS(\sigma, S_t)$
is the option price from the classical Black and Scholes model. A more
in-depth discussion of this approach can be found in pp. 35-36 of Johannes and
Polson (2002). Both Johannes and Polson (2002) and Darsinos and
Satchell (2001) obtain the posterior distribution of the volatility parameter $\sigma$
from the discrete time version of the continuous time process. Although this
posterior distribution exists in discrete time, in continuous time it
is a degenerate point mass under a particular reference probability measure,
as Polson and Roberts (1994) explain.

In order to get the posterior distribution of the theoretical Black and
Scholes price, Johannes and Polson (2002) use a perturbation $\varepsilon_t$ around the
theoretical Black and Scholes price to construct a likelihood. Our framework
consists of retrieving the likelihood function directly from the
Radon-Nikodym process ($Z_t^{\theta^*} = \frac{dQ}{dP}\big|_{\mathcal{F}_t}$) used when performing a change of measure
from the physical measure $P$ to the risk-neutral measure $Q$, where
$\theta^*$ is a function of the vector parameter $\theta$ governing the probability
distribution of the logarithm of the exponential Lévy process $S_t$. Combining our
likelihood $(Z_t^{\theta^*})^{-1} = \frac{dP}{dQ}\big|_{\mathcal{F}_t}$ with a prior $\pi(\theta)$ enables us to derive a
posterior distribution $\pi(\theta \mid \{S_s : 0 \le s \le t\})$ given an observed price history
$\{S_s : 0 \le s \le t\}$. Darsinos and Satchell (2001), working within a Black and
Scholes model and using the density of the distribution of the underlying
with respect to the Lebesgue measure, are able to find a closed-form solution
for the posterior of the call option. Our method can be carried out with the
use of numerical simulations from the parameter's posterior distribution by
standard Bayesian numerical methods as we illustrate throughout the paper.

Besides allowing us to integrate prior beliefs about the price process pa- 
rameters together with the likelihood inherent in the price process to generate 
a marginal distribution for the price of the option, our method has another 
theoretical motivation as well. In the method summarized by equation (1) and
equation (2), the parameters $\sigma$ and $v$ are estimated jointly using the call option
price data and the price data for the underlying (see Johannes and Polson
(2002) for the details). As a consequence, both of these parameters capture
the joint effects of risk aversion and price volatility on option pricing, and it 
is somewhat unclear how to sort out these effects in that model. This is not 
surprising, since in that case the error term is interpreted as some sort of 
observation error. 

The classical framework of option pricing supposes a call option $C(t, S_t, \theta)$
whose payoff $H(S_T)$ depends on our underlying $S_t$, and can be computed via
the following integration¹:

$$C(t, S_t, \theta) = \exp(-r(T-t))\, E_Q\{H(S_T) \mid \mathcal{F}_t\}$$

where integration is performed under the risk-neutral measure $Q$, such that
the discounted stock price $\exp(-rt) S_t$ is a $Q$-martingale. General integration
theory states that the following change of measure is also possible by invoking
the Radon-Nikodym theorem:

$$C(t, S_t, \theta) = \exp(-r(T-t))\, E_P\Big\{\frac{dQ}{dP}\Big|_{\mathcal{F}_T} H(S_T) \,\Big|\, \mathcal{F}_t\Big\}$$

1 For a European call option, the payoff function is equal to $\max(S_T - K, 0)$ where $K$
is the strike price at termination date $T$.

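and if considering a prior $\pi(d\theta)$, we can perform the following integration
with respect to the prior $\pi(d\theta)$:

$$C(t, S_t) = \int_\Theta C(t, S_t, \theta)\,\pi(d\theta) = \exp(-r(T-t)) \int_\Theta g(S_u : u \le t)\,\pi(d\theta \mid \mathcal{F}_t)\, E_P\{Z_T^{\theta} H(S_T) \mid \mathcal{F}_t\}$$

where $\theta \in \Theta$, $\Theta$ is the parameter space, $(Z_t^{\theta})^{-1} = \frac{dP}{dQ}\big|_{\mathcal{F}_t}$, and

$$g(S_u : u \le t) = \int_\Theta \frac{dP}{dQ}\Big|_{\mathcal{F}_t}\,\pi(d\theta)$$

is the marginal probability distribution of the underlying $S_t$, which is thus a
constant given the information $\mathcal{F}_t$.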

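The posterior distribution $\pi(d\theta \mid \mathcal{F}_t)$ comes from a direct application of Bayes'
rule:

$$\pi(d\theta \mid \mathcal{F}_t) = \frac{\pi(d\theta)\,\frac{dP}{dQ}\big|_{\mathcal{F}_t}}{g(S_u : u \le t)}, \qquad \text{that is,} \qquad \pi(d\theta \mid \mathcal{F}_t)\, g(S_u : u \le t) = \pi(d\theta)\,\frac{dP}{dQ}\Big|_{\mathcal{F}_t}.$$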



This interesting result shows that in order to integrate out the uncertainty
related to the governing parameters from the probability distribution of the
underlying, one needs to use the likelihood that comes automatically from the
specification of the underlying through $\frac{dP}{dQ}\big|_{\mathcal{F}_t}$.

It is often the case that competing models could have generated the underlying
process $S_t$. Let us suppose that we have $k$ of those competing models²,
each of which has a prior probability $P(M_i)$, a Radon-Nikodym derivative
$\frac{dP^i}{dQ^i}\big|_{\mathcal{F}_t}$, and a model parameter vector $\theta^i$ for $i = 1, \dots, k$. These vectors
$\theta^i$ might have similar or completely different interpretations depending on
the model they refer to. If we wish to see which of the models fits the data
best, then one could compute their posterior probabilities. The marginal
likelihood $P(S_u : u \le t \mid M_i)$ of model $M_i$ gives a measure of observing the
data given that $M_i$ is true. This quantity can be computed as follows:

$$P(S_u : u \le t \mid M_i) = \int_{\Theta^i} \frac{dP^i}{dQ^i}\Big|_{\mathcal{F}_t}\,\pi(d\theta^i).$$

2 Examples of such models could be the case where the underlying process $S_t$ is described
either by a jump-diffusion, a diffusion, or a pure jump Lévy process, or by the same
process with different parameters.

The marginal likelihood together with the prior probability on model $i$
enables us to use Bayes' theorem, which updates our prior beliefs that the data
come from model $M_i$ as follows:

$$P(M_i \mid S_u : u \le t) = \frac{P(S_u : u \le t \mid M_i)\,P(M_i)}{\sum_{j=1}^{k} P(S_u : u \le t \mid M_j)\,P(M_j)}$$

As Cont (2006) points out, model uncertainty regarding option pricing 
models leads to "model risk" which can be seen as an extension of model 
misspecification. Bayesian model averaging is a way to incorporate model 
uncertainty for option prices, where one could compute an option price inte- 
grating model uncertainty in the following way: 

$$C(t, S_t) = \sum_{i=1}^{k} C(t, S_t \mid M_i)\, P(M_i \mid S_u : u \le t)$$

where the $C(t, S_t \mid M_i)$ are the option prices $C(t, S_t)$ computed under model
$M_i$. $C(t, S_t)$ would thus yield a model weighted "average" option price.
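
The two formulas above (posterior model probabilities and the model-averaged price) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the log marginal likelihoods, prior probabilities and per-model prices below are hypothetical numbers, and the function names are ours, not part of any library.

```python
import math

def posterior_model_probs(log_marginal_liks, prior_probs):
    """Bayes' theorem over k models, computed from log marginal likelihoods.

    Subtracting the largest log weight before exponentiating avoids underflow.
    """
    log_post = [ll + math.log(p) for ll, p in zip(log_marginal_liks, prior_probs)]
    top = max(log_post)
    weights = [math.exp(v - top) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]

def model_averaged_price(prices, post_probs):
    """Weighted sum of the per-model option prices C(t, S_t | M_i)."""
    return sum(c * p for c, p in zip(prices, post_probs))

# Hypothetical inputs for three competing models (illustrative values only).
log_ml = [-120.3, -121.0, -125.4]   # log P(S_u : u <= t | M_i)
priors = [1 / 3, 1 / 3, 1 / 3]      # P(M_i)
prices = [4.10, 4.35, 3.90]         # C(t, S_t | M_i)

probs = posterior_model_probs(log_ml, priors)
bma_price = model_averaged_price(prices, probs)
```

Working on the log scale is a design choice: marginal likelihoods of long price histories are typically far too small to represent directly as floating point numbers.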

Although we do not always have a closed-form solution for the integral
$\int_{\Theta^i} \frac{dP^i}{dQ^i}\big|_{\mathcal{F}_t}\,\pi(d\theta^i)$, we can approximate it through classical Markov Chain
Monte Carlo methods to get the marginal likelihood. In order to do this, we
need a likelihood and a prior distribution on the model parameters in order 
to perform a Bayesian analysis. In the next section, we show how to find this 
likelihood. 

3 The Framework



3.1 Preview of coming attractions 

As motivation for the material in the following sections, we present a version
of the Black-Scholes model in discrete time, which stands between the
binomial model and the continuous time Black-Scholes model, and in which the
posteriorization procedure does not involve stochastic calculus. Assume that our
price process is described by a sequence $S_n = e^{\nu + \sigma R_n} S_{n-1}$, where $S_0 \in \mathbb{R}$ is given
and the sequence $\{R_n \mid n \ge 1\}$ is an i.i.d. sequence of $N(0, 1)$ random
variables. We take $(\Omega = \mathbb{R}^{\mathbb{N}}, \mathcal{F}_n, \mathcal{F}, P)$ as sample space, where $R_n$ will denote
the coordinate mappings, $\mathcal{F}_n = \sigma\{R_k \mid k \le n\}$, $\mathcal{F} = \sigma\{R_k \mid k < \infty\}$, and
$P$ is the obvious infinite product of Gaussian measures on $(\Omega, \mathcal{F})$.
It is a simple calculation to verify that the value of $a$ such that

$$E_P\big[e^{aR_1 - a^2/2}\, S_1\big] = e^{r} S_0$$

is $a = \frac{r - \mu}{\sigma}$, where $\nu = \mu - \sigma^2/2$. Also, it is a standard exercise to verify
that the limit $Z$ of the Wald martingale $Z_n = \prod_{i=1}^{n} e^{aR_i - a^2/2}$ provides us
with a measure $Q \sim P$ such that $Z_n = dQ/dP|_{\mathcal{F}_n}$ and that the present value
price process $S_n^* = S_n e^{-rn}$ is a $Q$-martingale.

It is also simple to verify that the likelihood $Z_n^{-1} = e^{-a\sum_{k=1}^{n} R_k + n a^2/2} = dP/dQ|_{\mathcal{F}_n}$
is a $Q$-martingale. The interest in $Z_n^{-1}$ is that it can be thought
of as the conditional density $\frac{dP}{dQ}(S_0, \dots, S_n \mid \mu)$. To see how this comes about,
note that $\sum_{k=1}^{n} R_k = \frac{1}{\sigma}\big[\sum_{k=1}^{n} \ln(S_k/S_{k-1}) - n\nu\big]$, and after completing squares
and re-arranging the exponent we obtain

$$Z_n^{-1}(S_0, \dots, S_n \mid \mu) = \exp\Big[\frac{n}{2\sigma^2}\Big(r - \frac{\sigma^2}{2} - \frac{\ln(S_n/S_0)}{n}\Big)^2\Big] \times \exp\Big[-\frac{n}{2\sigma^2}\Big(\mu - \frac{\sigma^2}{2} - \frac{\ln(S_n/S_0)}{n}\Big)^2\Big].$$

From this point on, the marginalization to obtain the posterior of $\mu$ given
$S_0, \dots, S_n$ is a routine matter in the Gaussian set up. Let us now proceed to
the continuous time case.
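
As a numerical check of the completing-the-square step, the sketch below evaluates $\log Z_n^{-1}$ directly from its definition and compares it with the Gaussian kernel in $\mu$. It assumes a flat prior on $\mu$ and a known $\sigma$, in which case the posterior of $\mu$ is normal with mean $\ln(S_n/S_0)/n + \sigma^2/2$ and variance $\sigma^2/n$. The function names and the numbers used are illustrative only.

```python
import math

def log_inverse_density(mu, sigma, r, log_return, n):
    """log Z_n^{-1} = -a * sum(R_k) + n * a^2 / 2, with a = (r - mu) / sigma.

    log_return is ln(S_n / S_0); sum(R_k) is recovered from it using
    nu = mu - sigma^2 / 2.
    """
    a = (r - mu) / sigma
    nu = mu - 0.5 * sigma ** 2
    sum_r = (log_return - n * nu) / sigma
    return -a * sum_r + 0.5 * n * a ** 2

def posterior_mu_params(sigma, log_return, n):
    """Mean and variance of the posterior of mu under a flat prior (assumption)."""
    return log_return / n + 0.5 * sigma ** 2, sigma ** 2 / n

# Illustrative values: 10 periods, ln(S_n / S_0) = 0.05, sigma = 0.2, r = 0.03.
post_mean, post_var = posterior_mu_params(0.2, 0.05, 10)
```

Adding $\frac{n}{2\sigma^2}(\mu - \text{mean})^2$ to `log_inverse_density` should give the same number for every $\mu$, which is exactly the statement that $Z_n^{-1}$ is proportional to a normal density in $\mu$.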

3.2 Change of measures and the likelihood function in
option pricing 

It is common to use the following form when modelling the underlying $S_t$ of
a derivative:

$$S_t = \exp(X_t)$$

where $X_t$ can either be a Lévy process or a diffusion, and thus $S_t$ is the
exponential³ of this process, to be discounted by $\exp(-rt)$. In order to price
options, one needs to find an equivalent probability measure $Q \sim P$ such that
$S_t^* = \exp(-rt + X_t)$ is a martingale under $Q$.

When performing a change of measure for a given stochastic process $S_t$
from $P$ to $Q$, one can regard $S$ as a random variable on the space $\Omega = D[0, +\infty)$
of càdlàg (continue à droite, limites à gauche) paths together with
its associated filtration $(\mathcal{F}_t)_{t \ge 0}$. This measure $P$ is therefore defined on the
space of sample paths of $X$, and so the Radon-Nikodym derivative $\frac{dP}{dQ}\big|_{\mathcal{F}_t}$
with respect to the reference measure $Q$ is the likelihood function⁴ after the
process has been observed up to time $t$.

It is easy to verify that the independence of the increments of $X$ implies
that $Z_t^{\theta^*} = \exp(\theta^* X_t - t k(\theta^*))$, where $k(\theta)$ is the cumulant function of $X$,
defined by $E_P[e^{\theta X_t}] = e^{t k(\theta)}$. Furthermore, $Z_t^{\theta^*}$ is not only a positive
$\mathcal{F}_t$-martingale under $P^{\theta}$, but happens to be the likelihood function given time $t$.
What is interesting here is that $S_t^* = S_t \exp(-rt) = \exp(X_t - rt)$ is a
$Q^{\theta^*}$-martingale for every $\theta$, as is easy to verify. This observation will enable us to
use Bayes' theorem in the following sections to compute posterior distributions
of the parameters that govern the dynamics of the underlying $S_t$. Since we
will mostly work with continuous stochastic processes, we set $\Omega = C[0, \infty)$,
and⁵ $(\Omega, \mathcal{F}, P)$ is the standard Wiener space.
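
For the Brownian case the Esscher density can be checked by simulation. The sketch below assumes $X_t = \nu t + \sigma W_t$ with $\nu = \mu - \sigma^2/2$, so that $k(\theta) = \nu\theta + \sigma^2\theta^2/2$ and the martingale condition $k(\theta^* + 1) - k(\theta^*) = r$ gives $\theta^* = (r - \mu)/\sigma^2$. It then estimates $E_P[Z_t^{\theta^*} e^{-rt} e^{X_t}]$ by Monte Carlo, which should be close to 1 (that is, $S_0$ with $S_0 = 1$). Parameter values and function names are illustrative assumptions.

```python
import math
import random

def cumulant(theta, nu, sigma):
    """k(theta) with E[exp(theta X_t)] = exp(t k(theta)) for X_t = nu t + sigma W_t."""
    return nu * theta + 0.5 * sigma ** 2 * theta ** 2

def esscher_parameter(mu, r, sigma):
    """Root of k(theta + 1) - k(theta) = r, available in closed form here."""
    return (r - mu) / sigma ** 2

def discounted_price_mean(mu, r, sigma, t, n_paths, seed=0):
    """Monte Carlo estimate of E_P[Z_t * exp(-r t) * exp(X_t)] under P."""
    rng = random.Random(seed)
    nu = mu - 0.5 * sigma ** 2
    theta = esscher_parameter(mu, r, sigma)
    total = 0.0
    for _ in range(n_paths):
        x_t = nu * t + sigma * math.sqrt(t) * rng.gauss(0.0, 1.0)
        z_t = math.exp(theta * x_t - t * cumulant(theta, nu, sigma))
        total += z_t * math.exp(x_t - r * t)
    return total / n_paths

estimate = discounted_price_mean(mu=0.10, r=0.03, sigma=0.2, t=1.0, n_paths=20000)
```

Note that $\sigma\theta^* = (r - \mu)/\sigma$, so in this special case the Esscher density coincides with the usual Girsanov density written in terms of $W_t$.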

3 See Applebaum (2004) and Øksendal (2003) for the case when $X_t$ is a Lévy process
and a diffusion respectively.

4 See chapter X, section 2 of Jacod and Shiryaev (1987) on the equivalence between
Radon-Nikodym derivatives and likelihood functions.

5 $C[0, \infty)$ is the space of continuous functions defined from $[0, \infty)$ into $\mathbb{R}$.

4 Parameter posteriors for some popular continuous
time models of the underlying

4.1 Geometric Brownian motion 

In the classic paper by Black and Scholes (1973), the stock price $S_t$ is the solution
to the following SDE:

$$\frac{dS_t}{S_t} = \mu\,dt + \sigma\,dW_t \qquad (3)$$

whose solution is equal to:

$$S_t = S_0 \exp\Big[\Big(\mu - \frac{\sigma^2}{2}\Big)t + \sigma W_t\Big] \qquad (4)$$

Working with the discounted stock price, we obtain

$$\exp(-rt)\, S_t = S_0 \exp\Big[\Big(\mu - r - \frac{\sigma^2}{2}\Big)t + \sigma W_t\Big] \qquad (5)$$

Invoking the Cameron-Martin-Girsanov (CMG) theorem enables us to
determine the deterministic risk-neutral condition $\mu = r$, which is the one that
defines the unique martingale measure. We then get that $S_t$ is a martingale⁶
under $Q$ and is equal to:

$$S_t = S_0 \exp\Big[\Big(r - \frac{\sigma^2}{2}\Big)t + \sigma W_t\Big] \qquad (6)$$



As it turns out, the martingale condition $\mu = r$ holds asymptotically in
our Bayesian framework, as $t$ tends to infinity, but for all finite $t$ the value
of $\mu$ consistent with no-arbitrage has a posterior distribution that is
non-degenerate.

Theorem 1. Under the geometric Brownian motion model given above, from the
Cameron-Martin-Girsanov theorem the density process $Z_t^{\theta^*}$ is given by

$$Z_t^{\theta^*} = \exp\Big[W_t\,\frac{r - \mu}{\sigma} - \frac{t}{2}\Big(\frac{r - \mu}{\sigma}\Big)^2\Big] \qquad (7)$$

and the posterior distribution for the drift parameter $\mu$ is given by

$$[\mu \mid \{S_s : 0 \le s \le t\}] \sim N\Big(r - \frac{\sigma W_t}{t},\ \frac{\sigma^2}{t}\Big) \qquad (8)$$

Proof. See Appendix. □

6 $S_t$ is a $Q$-martingale, which is equivalent to showing that $Z_t^{\theta^*} S_t$ is a $P$-martingale.



Theorem 1 illustrates that, even in the setting where markets are complete
and we can sample continuously, we obtain a posterior distribution for
the drift parameter $\mu$ that differs from the usual no-arbitrage condition that
$\mu = r$ with probability 1. We recover this no-arbitrage condition in the limit
as time goes to infinity, and when time is finite, we have that if $W_t > 0$, then
$E[\mu \mid \{S_s : 0 \le s \le t\}] = r - \frac{\sigma W_t}{t} < r$, and vice versa.
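
The behaviour of the posterior in Theorem 1 can be illustrated with a short simulation. The sketch below draws one Brownian path at a few increasing horizons and records the posterior mean $r - \sigma W_t/t$ and standard deviation $\sigma/\sqrt{t}$ at each. The values of $r$, $\sigma$ and the horizons are illustrative assumptions.

```python
import math
import random

def drift_posterior(r, sigma, w_t, t):
    """Mean and variance of the posterior N(r - sigma W_t / t, sigma^2 / t)."""
    return r - sigma * w_t / t, sigma ** 2 / t

rng = random.Random(1)
r, sigma = 0.03, 0.2
w, elapsed = 0.0, 0.0
summary = []  # rows of (horizon, posterior mean, posterior standard deviation)
for horizon in (1.0, 10.0, 100.0, 1000.0):
    # Extend the same Brownian path from the previous horizon to the new one.
    w += rng.gauss(0.0, math.sqrt(horizon - elapsed))
    elapsed = horizon
    mean, var = drift_posterior(r, sigma, w, horizon)
    summary.append((horizon, mean, math.sqrt(var)))
```

The posterior standard deviation shrinks like $1/\sqrt{t}$ regardless of the path, while the posterior mean sits below $r$ whenever $W_t > 0$ and above it otherwise.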

One important observation is that as $t \to +\infty$ we get $\frac{\sigma^2}{t} \to 0$ and thus
$\mu \to \theta_0 = r$ since $\frac{\sigma W_t}{t} \to 0$ as $t \to +\infty$. Therefore, the posterior
distribution $[\mu \mid \{S_s : 0 \le s \le t\}]$ is consistent at $r = \theta_0$. This is a property that a
posterior distribution should have in general if we wish to learn more about
a parameter, and is the Bayesian analogue of consistency for an estimator in
classical statistics. We now adapt the definition of consistency
from Ghosh and Ramamoorthi (2003) and propose the following one:

Definition 1. For each $t$, let $\pi(\theta \mid S_s : 0 \le s \le t)$ be a posterior probability
distribution given $\{S_s : 0 \le s \le t\}$. The collection $\{\pi(\theta \mid S_s : 0 \le s \le t)\}$ is
said to be consistent at $\theta_0$ if there is a $\Omega_0 \subset \Omega = C[0, \infty)$ with $P_{\theta_0}(\Omega_0) = 1$
such that if $\omega$ is in $\Omega_0$, then for every neighborhood $U$ of $\theta_0$,

$$\pi(U \mid S_s : 0 \le s \le t) \to 1 \quad \text{a.s. } P_{\theta_0} \qquad (9)$$

where $P_{\theta_0}$ is a probability measure defined on the space of right continuous
functions with left limits, and $S_t$ is the underlying process.

When performing a Bayesian analysis, it is important for the posterior to converge to a
degenerate point mass at the unknown parameter $\theta_0$, which means that, in our
setup, the posterior variance tends to 0. In the previous example, we obtain
consistency as the posterior variance $\frac{\sigma^2}{t} \to 0$. It is worth noticing that $Z_t^{\theta^*}$
depends on the initial and final values $S_0$ and $S_t$ through $\log\frac{S_t}{S_0}$. Working
with continuously compounded returns enables us to use the same definition
of consistency as in Ghosh and Ramamoorthi (2003), as well as to prove
that two agents having different prior distributions $\pi(\theta)$ reach in the end
the same conclusions regarding $\theta$. Throughout our examples we shall show
that our posterior distributions are consistent for both $\mu$ and $\sigma^2$. We now
cite from Ghosh and Ramamoorthi (2003) the following second definition of
consistency as well as a theorem:

Definition 2. For each $n$, let $\pi(\theta \mid R_t^1, \dots, R_t^n)$ be a posterior given the i.i.d.
sequence $R_t^1, \dots, R_t^n$. The sequence $\{\pi(\theta \mid R_t^1, \dots, R_t^n)\}$ is said to be consistent
at $\theta_0$ if there is a $\Omega_0 \subset W$ with⁷ $P_{\theta_0}^{\infty}(\Omega_0) = 1$ such that if $\omega$ is in $\Omega_0$, then
for every neighborhood $U$ of $\theta_0$,

$$\pi(U \mid R_t^1, \dots, R_t^n) \to 1 \quad \text{a.s. } P_{\theta_0}^{\infty} \qquad (10)$$

where $R_t^i = \log\big(\frac{S_t^i}{S_0^i}\big)$ is the return during the $i$-th interval of length $t$. $S_0^i$ and
$S_t^i$ are the prices at the beginning and the end of the $i$-th interval of length $t$.
We have $n$ of those independent returns since for a given time interval $[0, T]$
such that $T = nt$, we use underlying processes for $S_t$ with stationary and
independent increments.

What if two different agents used the same probability model that generated
the underlying price process $S_t$, but had different prior distributions
regarding the parameters governing the latter? Would both posterior
distributions be consistent at the same value $\theta_0$? In other words, under what
conditions would two different priors lead to the same inference? In order to
answer this question, we cite from Ghosh and Ramamoorthi (2003) the general
definition of consistency for a posterior distribution, together with a theorem.

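Theorem 2. Assume that the family $\{P_\theta^t : \theta \in \Theta\}$ is dominated by a $\sigma$-finite
measure $\mu$ and let $p_\theta$ denote the density of $P_\theta^t$. Let $\theta_0$ be an interior point
of $\Theta$, and $\pi_1$, $\pi_2$ be two prior densities with respect to a measure $\nu$, which
are positive and continuous at $\theta_0$. Let $\pi_i(\theta \mid R_t^1, \dots, R_t^n)$, $i = 1, 2$, denote the
posterior densities of $\theta$ given $\{R_t^1, \dots, R_t^n\}$. If $\pi_i(\theta \mid R_t^1, \dots, R_t^n)$, $i = 1, 2$, are
both consistent at $\theta_0$ then:

$$\lim_{n \to \infty} \int \big|\pi_1(\theta \mid R_t^1, \dots, R_t^n) - \pi_2(\theta \mid R_t^1, \dots, R_t^n)\big|\, d\nu(\theta) = 0 \quad \text{a.s. } P_{\theta_0} \qquad (11)$$

where $P_\theta^t$ is the probability distribution of the return $R_t = \log\big(\frac{S_t}{S_0}\big)$.

7 Here $W = \mathbb{R}^{\infty}$ is the space of infinite sequences, and $P_\theta^{\infty} = \otimes P_\theta^t$ is the product probability
measure defined on the sigma algebra $\mathcal{B}(\mathbb{R}^{\infty})$ of $W$. Here $P_\theta^t$ is the probability distribution
of the returns $R_t^i$ with respect to $Q_\theta$.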

This last theorem shows that as $n$ increases (and therefore our information
set increases), the importance of the prior distribution fades away (and
thus the importance of the prior's hyperparameters as well), since we get to
observe more and more data. In our framework, the dominating measure $\mu$
of Ghosh and Ramamoorthi (2003) is our $Q$ and their family $P_\theta^t$ corresponds
to our physical probability measure regarding the returns $R_t$.
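
The merging of posteriors described by Theorem 2 can be illustrated numerically. The sketch below is a simplified setting of our own choosing: $\sigma$ is treated as known, two agents hold different normal priors on $\mu$, and the returns over intervals of length $t$ are simulated as $N((\mu_0 - \sigma^2/2)t, \sigma^2 t)$. The $L^1$ distance between the two (conjugate normal) posteriors is computed with a trapezoid rule. All numerical values and function names are hypothetical.

```python
import math
import random

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def posterior_mu(returns, t, sigma, m, s):
    """Posterior (mean, sd) of mu for a N(m, s^2) prior and known sigma."""
    n = len(returns)
    c_bar = sum(returns) / (n * t) + 0.5 * sigma ** 2   # data-based centre for mu
    data_precision = n * t / sigma ** 2
    precision = data_precision + 1.0 / s ** 2
    mean = (data_precision * c_bar + m / s ** 2) / precision
    return mean, math.sqrt(1.0 / precision)

def l1_distance(p1, p2, grid_points=4001):
    """Trapezoid approximation of the integral of |pdf1 - pdf2| for two normals."""
    lo = min(p1[0] - 8 * p1[1], p2[0] - 8 * p2[1])
    hi = max(p1[0] + 8 * p1[1], p2[0] + 8 * p2[1])
    h = (hi - lo) / (grid_points - 1)
    total = 0.0
    for k in range(grid_points):
        x = lo + k * h
        weight = 0.5 if k in (0, grid_points - 1) else 1.0
        total += weight * abs(normal_pdf(x, *p1) - normal_pdf(x, *p2))
    return total * h

# Simulated daily returns (illustrative parameters).
rng = random.Random(7)
mu0, sigma, t = 0.10, 0.2, 1.0 / 252
data = [rng.gauss((mu0 - 0.5 * sigma ** 2) * t, sigma * math.sqrt(t)) for _ in range(50000)]

distances = {}
for n in (10, 1000, 50000):
    agent_1 = posterior_mu(data[:n], t, sigma, m=0.0, s=0.1)
    agent_2 = posterior_mu(data[:n], t, sigma, m=0.2, s=0.3)
    distances[n] = l1_distance(agent_1, agent_2)
```

With only a handful of returns the two posteriors are essentially the two priors, so the distance is large; with many returns the data term dominates both updates and the distance becomes small.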



4.2 A Bayesian Inference of the Black & Scholes model 

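As we just saw in the previous section, the log returns of the stock price
$\frac{1}{t}\log\frac{S_t}{S_0} \sim N\big(\mu - \frac{\sigma^2}{2},\ \frac{\sigma^2}{t}\big)$ are normally distributed⁸, thus the conditional
likelihoods for $\mu$ and $\sigma^2$ in the Black & Scholes model are proportional to (see
appendix):

$$L\Big(\sigma^2 \,\Big|\, \log\frac{S_t}{S_0}, \mu\Big) \propto (\sigma^2)^{-1/2} \exp\Big\{-\frac{1}{2}\Big[\frac{t\big(\mu - \frac{1}{t}\log\frac{S_t}{S_0}\big)^2}{\sigma^2} + \frac{t\sigma^2}{4}\Big]\Big\} \qquad (12)$$

$$L\Big(\mu \,\Big|\, \log\frac{S_t}{S_0}, \sigma^2\Big) \propto \exp\Big\{-\frac{t}{2\sigma^2}\Big[\mu - \Big(\frac{1}{t}\log\frac{S_t}{S_0} + \frac{\sigma^2}{2}\Big)\Big]^2\Big\} \qquad (13)$$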



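The first conditional likelihood, given $\mu$, is proportional to a Generalized
Inverse Gaussian with parameters $\lambda$, $\delta$, and $\gamma$ (in short $GIG(\lambda, \delta, \gamma)$)
whose density is equal to:

$$f(x \mid \lambda, \delta, \gamma) = \Big(\frac{\gamma}{\delta}\Big)^{\lambda} \frac{1}{2K_\lambda(\gamma\delta)}\, x^{\lambda - 1} \exp\Big\{-\frac{1}{2}\Big(\frac{\delta^2}{x} + \gamma^2 x\Big)\Big\} \qquad (14)$$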



where we use the same parametrization as in Silva et al (2006). We note
that equation (12) is proportional to a $GIG(\lambda, \delta, \gamma)$ with parameters equal
to $\lambda = \frac{1}{2}$, $\delta^2 = t\big(\mu - \frac{1}{t}\log\frac{S_t}{S_0}\big)^2$, and $\gamma^2 = \frac{t}{4}$, and that the conditional posterior
distribution of $\mu$ given $\sigma^2$ is normally distributed as $N\big(\frac{1}{t}\log\frac{S_t}{S_0} + \frac{\sigma^2}{2},\ \frac{\sigma^2}{t}\big)$. We
prove these last statements and more in the following two lemmas whose
proofs are in the appendix.

8 Here our Radon-Nikodym derivative $\frac{dP}{d\Lambda}\big|_{\mathcal{F}_t}$ is with respect to Lebesgue measure $\Lambda$.
One can relate this ratio to the one taken with respect to a probability measure using the following
chain rule: $\frac{dP}{d\Lambda}\big|_{\mathcal{F}_t} = \frac{dP}{dQ}\big|_{\mathcal{F}_t}\,\frac{dQ}{d\Lambda}\big|_{\mathcal{F}_t}$.

Lemma 1. When choosing flat uniform prior distributions on $\mathbb{R}$ and $\mathbb{R}^+$ for
$\mu$ and $\sigma^2$ in the Black and Scholes model, their posterior conditional
distributions are $\pi\big(\mu \mid \sigma^2, \log\frac{S_t}{S_0}\big) = N\big(\frac{1}{t}\log\frac{S_t}{S_0} + \frac{\sigma^2}{2},\ \frac{\sigma^2}{t}\big)$
and $\pi\big(\sigma^2 \mid \mu, \log\frac{S_t}{S_0}\big) = GIG(\lambda, \delta, \gamma)$ respectively, where $\lambda = \frac{1}{2}$,
$\delta^2 = t\big(\mu - \frac{1}{t}\log\frac{S_t}{S_0}\big)^2$, and $\gamma^2 = \frac{t}{4}$.

Proof. See appendix. □

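Lemma 2. When choosing a normal prior distribution $\pi(\mu) = N(m, s^2)$ on
$\mathbb{R}$ for $\mu$, and a $\pi(\sigma^2) = GIG(\lambda, \delta, \gamma)$ prior on $\mathbb{R}^+$ for $\sigma^2$ in the Black &
Scholes model, their posterior conditional distributions are

$$\pi\Big(\mu \,\Big|\, \sigma^2, \log\frac{S_t}{S_0}\Big) = N\Big(\frac{m\sigma^2 + ts^2\big(\frac{1}{t}\log\frac{S_t}{S_0} + \frac{\sigma^2}{2}\big)}{ts^2 + \sigma^2},\ \frac{\sigma^2 s^2}{ts^2 + \sigma^2}\Big)$$

and $\pi\big(\sigma^2 \mid \mu, \log\frac{S_t}{S_0}\big) = GIG(\lambda', \delta', \gamma')$, where
$\lambda' = \lambda - \frac{1}{2}$, $\delta'^2 = t\big(\mu - \frac{1}{t}\log\frac{S_t}{S_0}\big)^2 + \delta^2$, and $\gamma'^2 = \frac{t}{4} + \gamma^2$.

Proof. See appendix. □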

When modelling the underlying with a given process in continuous time
(in this example a geometric Brownian motion), there is already a likelihood
function implied by the latter. This methodology proposes an extension of
Karolyi (1993) and Darsinos and Satchell (2001), where the posterior for $\sigma^2$ is
an Inverse Gamma distribution, which appears to be a specific case of the
$GIG(\lambda, \delta, \gamma)$ as explained in Silva et al (2006). Furthermore, both Karolyi
(1993) and Darsinos and Satchell (2001) construct the likelihood for $\sigma^2$ relying
on the mathematical result that when log returns are normally distributed,
the statistic⁹ $\frac{\nu s^2}{\sigma^2} \sim \chi^2(\nu)$ and is independent of $\hat{\mu}$, since the latter is sufficient.
Our extension relies on a more general mathematical framework that consists
of working with the Radon-Nikodym derivative $Z_t$ instead¹⁰.

With this example we illustrate one of the core points of this paper: that
one is not free to elicit a likelihood when imposing a mathematical model
for the underlying stochastic process, which already has one. These two
likelihoods can lead to different inferences for the parameter $\sigma^2$ unless they
are by coincidence proportional to one another, as Berger and Wolpert (1984)
point out.

9 $\chi^2(\nu)$ is a chi-square random variable with $\nu = n - 1$ degrees of freedom and $s^2$ is the
unbiased sample variance of the log returns.
10 Here the reference measure is just Lebesgue measure.

From the Cameron-Martin-Girsanov theorem, the likelihood function for
a general diffusion process $dX_t = f(\theta, t, X_t)\,dt + \sigma(t, X_t)\,dW_t$ is given by¹¹:

$$Z_t = \exp\Big\{\int_0^t \frac{r - f(\theta, s, X_s)}{\sigma(s, X_s)}\,dW_s - \frac{1}{2}\int_0^t \Big(\frac{r - f(\theta, s, X_s)}{\sigma(s, X_s)}\Big)^2 ds\Big\}$$

where $S_t = S_0 \exp\big(X_t - \frac{1}{2}\int_0^t \sigma(s, X_s)^2\,ds\big)$ is the stock price process.
Polson and Roberts (1994) compute posterior probability distributions as well
as Bayes factors for the drift of the diffusion when $f(\theta, t, X_t) = \theta f(t, X_t)$.
Our posterior distribution for the drift reduces to example 4 in Polson and
Roberts (1994) when $r = 0$. We not only develop a Bayesian methodology
for consistently estimating parameters in the context of option pricing, but
also compute the posterior probability distributions for both $\sigma^2$ and $\mu$
in continuous time, extending the work by Karolyi (1993).



5 Discussion and Conclusions 

When traders and market participants use pricing formulas for derivatives, 
the price they obtain is a function of the parameter values that they as- 
sume. However, it is reasonable to expect that these parameter values are 
not known with certainty. As a result, it is worthwhile to take this uncer- 
tainty about parameter values into account in option pricing. The natural 
way of doing this is through the use of Bayesian methods. In this paper, 
we present a framework for Bayesian option pricing that can be applied to 
a very general set of stochastic processes for the underlying. Using directly
the probability model for the stochastic process of the stock price,
Girsanov's theorem helps us derive the likelihood function, which is one of the
key ingredients that enables us to obtain posterior probability distributions for
model parameters.

We show that in our framework, in which the posterior distribution for the
parameter estimates is computed using the likelihood implied by the underlying
stochastic process and a prior distribution that represents agents' beliefs
about the parameter values, the likelihood function is constructed from a
martingale. Asymptotically, as time goes to infinity and we collect more
data, our prior beliefs can be updated. Moreover, we discuss our framework
with the example of geometric Brownian motion, where our findings
are consistent with those of Polson and Roberts (1994) in the context of the
posterior distribution of the drift $\mu$. We extend the methodology of Karolyi
(1993) by noticing how to retrieve the implied likelihood imposed by the
stochastic process of the underlying. In this sense we show that when working
with derivatives, it is important to have coherence between the likelihood
implied by the stochastic process of the underlying and the likelihood of the
econometric model used for inference.

11 See Øksendal (2003).

In terms of future research, the most worthwhile application of these tech- 
niques would probably be to the case of a tractable version of a general Levy 
process, which in continuous time yields nontrivial likelihoods for the case of 
jump-diffusion processes for all of the parameters, as we see in the appendix. 
In addition, a potentially fruitful application of this technique in discrete 
time, to binomial tree pricing methods, is the subject of current research. 



Appendix 



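Posterior for $\mu$

It is a standard computation to show that the value of $a \in \mathbb{R}$ such that
$Z_t^a = \exp\big(aW_t - \frac{a^2 t}{2}\big)$ is a density for the risk neutral measure $Q$ making
$Z_t^a S_t \exp(-rt)$ into a $P$-martingale is $a = \frac{r - \mu}{\sigma}$. If $\theta = (\mu, \sigma^2)$ are the model
parameters, then $\theta^* = a(\theta) = \frac{r - \mu}{\sigma}$.

We see how $\theta^*$ is a function of $\theta = (\mu, \sigma)$ and makes the discounted price into a
$Q$-martingale, where the Radon-Nikodym derivative $\frac{dQ}{dP}\big|_{\mathcal{F}_t}$ is equal to:

$$\frac{dQ}{dP}\Big|_{\mathcal{F}_t} = \exp\Big(\theta^* W_t - \frac{\theta^{*2}\, t}{2}\Big)$$

Substituting $\theta^*$ in the Radon-Nikodym derivative and using a flat uniform
prior for $\mu$, the posterior is proportional to:

$$\propto \exp\Big\{-\frac{t}{2\sigma^2}\Big[\mu^2 - 2\mu\Big(r - \frac{\sigma W_t}{t}\Big)\Big]\Big\} \propto \exp\Big\{-\frac{t}{2\sigma^2}\Big[\mu - \Big(r - \frac{\sigma W_t}{t}\Big)\Big]^2\Big\}$$

We conclude that the posterior distribution for $\mu$ is:

$$N\Big(r - \frac{\sigma W_t}{t},\ \frac{\sigma^2}{t}\Big)$$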

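A Posterior probability distributions for Jump-Diffusion
processes

Consider first the case where the price process is $S_t = e^{\nu t + \sigma W_t}$, with $W$
as above and $\nu = \mu - \sigma^2/2$. In this case getting $(Z_t^{\theta^*})^{-1}$ ready for
posteriorization will be pretty much like in section 3.1. Note to begin with
that from $dS_t = \mu S_t\,dt + \sigma S_t\,dW_t$ and Itô's formula, on the one hand
$\sigma\,dW_t = dS_t/S_t - \mu\,dt$ and on the other hand

$$d\Big(\ln(S_t/S_0) + \frac{\sigma^2 t}{2}\Big) = dS_t/S_t$$

and therefore that

$$\sigma W_t = \ln(S_t/S_0) - t\nu.$$

With all this, just some arithmetic allows us to rearrange the exponent in
$(Z_t^{\theta^*})^{-1} = \exp\big[-\theta^* W_t + \frac{\theta^{*2} t}{2}\big]$, where recall, $\theta^* = (r - \mu)/\sigma$, so that

$$(Z_t^{\theta^*})^{-1} = \exp\Big[\frac{t}{2\sigma^2}\Big(r - \frac{\sigma^2}{2} - \frac{\ln(S_t/S_0)}{t}\Big)^2\Big] \times \exp\Big[-\frac{t}{2\sigma^2}\Big(\mu - \frac{\sigma^2}{2} - \frac{\ln(S_t/S_0)}{t}\Big)^2\Big].$$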

When the asset prices are driven by discontinuous factors, part of the
routine is easy, but part is not. Assume for example that $dS_t = \mu S_t\,dt +
\sigma S_t\,dW_t + S_{t-}\,dJ(t)$, where the first two terms on the right hand side are as in the
previous example, whereas $J(t) = \sum_{n=1}^{N(t)} \xi_n$ is a compound Poisson process
such that $N(t)$ is a Poisson process with intensity $\lambda$ and the i.i.d. sequence $\{\xi_n\}$ is
independent of both $W$ and $N$, and for the sake of simplicity, let us assume
the $\xi_n$ to be bounded.

The price equation has solution

$$S_t = e^{t(\mu - \sigma^2/2) + \sigma W_t} \prod_{n=1}^{N(t)} (1 + \xi_n) = e^{t(\mu - \sigma^2/2) + \sigma W_t + Y(t)}$$

where $Y_t = \sum_{n=1}^{N(t)} \ln(1 + \xi_n) = \sum_{n=1}^{N(t)} \eta_n$. Clearly we must assume that
$(1 + \xi_n) > 0$ for the price process to be positive. The boundedness assumption
on the jumps yields the existence of $\int (e^{\theta x} - 1)\,n(dx)$, with $n(dx) = dP(\eta_1 \le x)$.
With this, clearly $E[e^{\theta(\sigma W_t + Y_t)}] = e^{t k(\theta)}$, with

$$k(\theta) = \frac{\sigma^2\theta^2}{2} + \lambda \int (e^{\theta x} - 1)\,n(dx).$$

This time when we ask for the existence of a value $\theta^*$ such that

$$E\big[e^{\theta^*(\sigma W_t + Y_t) - t k(\theta^*)}\, S_t\big] = e^{tr}$$

we are led to the equation $k(\theta^* + 1) - k(\theta^*) = r - \nu$, with $\nu$ as above. This
amounts to solving

$$\sigma^2\theta^* + \lambda \int (e^{x} - 1)\,e^{x\theta^*}\,n(dx) = r - \mu.$$

When $\theta^*$ ranges from $-\infty$ to $+\infty$, the left hand side of the identity above
ranges in an increasing way over the same range, thus the equation has a
solution. So there is at least one risk neutral measure for the given asset
price. Furthermore, it is not hard to see using Itô's formula that this time
we also have

$$\sigma W_t + Y_t = \ln(S_t/S_0) - t\nu$$

but the likelihood

$$(Z_t^{\theta^*})^{-1} = \exp\big[-\theta^*(\sigma W_t + Y_t) + t k(\theta^*)\big] = \exp\big[-\theta^*(\ln(S_t/S_0) - t\nu) + t k(\theta^*)\big]$$

does not lead to a posterior distribution for $\theta$ that is easy to sample from.
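
Although the posterior is not easy to sample from, the equation for $\theta^*$ itself is straightforward to solve numerically because its left hand side is increasing in $\theta^*$. The sketch below uses plain bisection and assumes, purely for illustration, that the log jump sizes $\eta$ take finitely many values with given probabilities (a hypothetical two-point law). Function names and parameter values are ours.

```python
import math

def esscher_equation(theta, mu, r, sigma, lam, jumps):
    """sigma^2 * theta + lam * E[(e^eta - 1) e^{eta * theta}] - (r - mu).

    jumps is a list of (eta_value, probability) pairs describing n(dx).
    """
    jump_term = sum(p * (math.exp(x) - 1.0) * math.exp(x * theta) for x, p in jumps)
    return sigma ** 2 * theta + lam * jump_term - (r - mu)

def solve_theta_star(mu, r, sigma, lam, jumps, tol=1e-12):
    """Bisection on the increasing function esscher_equation(theta, ...)."""
    def f(theta):
        return esscher_equation(theta, mu, r, sigma, lam, jumps)
    lo, hi = -1.0, 1.0
    while f(lo) > 0:      # widen the bracket until the sign changes
        lo *= 2
    while f(hi) < 0:
        hi *= 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Hypothetical jump law: eta = -0.1 or +0.1 with equal probability, intensity 1.
two_point = [(-0.1, 0.5), (0.1, 0.5)]
theta_star = solve_theta_star(mu=0.10, r=0.03, sigma=0.2, lam=1.0, jumps=two_point)
```

With `lam = 0` the root reduces to $(r - \mu)/\sigma^2$, which is the diffusion-only case.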

B Proof of Lemmas 1 and 2 

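In this section we derive the full conditional posterior distributions for both
$\mu$ and $\sigma^2$ in the Black & Scholes model. In their model, the log returns are
normally distributed as $\frac{1}{t}\log\frac{S_t}{S_0} \sim N\big(\mu - \frac{\sigma^2}{2},\ \frac{\sigma^2}{t}\big)$. Once we observe the whole
sample path $S_t$, we get the following likelihood function for $\mu$ and $\sigma^2$:

$$L\Big(\mu, \sigma^2 \,\Big|\, \log\frac{S_t}{S_0}\Big) \propto (\sigma^2)^{-1/2} \exp\Big\{-\frac{t}{2\sigma^2}\Big[\frac{1}{t}\log\frac{S_t}{S_0} - \Big(\mu - \frac{\sigma^2}{2}\Big)\Big]^2\Big\} \qquad (15)$$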



Proof for Lemma 1:

Proof. Since the priors for both $\mu$ and $\sigma^2$ are both flat priors, the joint
posterior distribution is proportional to the likelihood function, which is equal
to equation (15). Keeping only the terms in $\sigma^2$ in equation (15), we conclude
easily that the conditional posterior distribution for $\sigma^2$ is a $GIG(\lambda', \delta', \gamma')$
where $\lambda' = \frac{1}{2}$, $\delta'^2 = t\big(\mu - \frac{1}{t}\log\frac{S_t}{S_0}\big)^2$, and $\gamma'^2 = \frac{t}{4}$. The conditional posterior
distribution for $\mu$ is a $N\big(\frac{1}{t}\log\frac{S_t}{S_0} + \frac{\sigma^2}{2},\ \frac{\sigma^2}{t}\big)$. □

Proof for Lemma 2:



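Proof. By Bayes' rule, the posterior $\pi\big(\sigma^2 \mid \log\frac{S_t}{S_0}, \mu\big)$ for $\sigma^2$ given $\mu$ and the
data is proportional to:

$$\propto L\Big(\sigma^2 \,\Big|\, \log\frac{S_t}{S_0}, \mu\Big)\,\pi(\sigma^2) \propto (\sigma^2)^{-1/2} \exp\Big\{-\frac{1}{2}\Big[\frac{t\sigma^2}{4} + \frac{t\big(\mu - \frac{1}{t}\log\frac{S_t}{S_0}\big)^2}{\sigma^2}\Big]\Big\}\,\pi(\sigma^2)$$

and the likelihood $L\big(\sigma^2 \mid \mu, \log\frac{S_t}{S_0}\big)$ times the $GIG(\lambda, \delta, \gamma)$ prior distribution
$\pi(\sigma^2)$ yields:

$$(\sigma^2)^{(\lambda - \frac{1}{2}) - 1} \exp\Big\{-\frac{1}{2}\Big[\frac{t\big(\mu - \frac{1}{t}\log\frac{S_t}{S_0}\big)^2 + \delta^2}{\sigma^2} + \Big(\frac{t}{4} + \gamma^2\Big)\sigma^2\Big]\Big\}$$

which is proportional to a $GIG(\lambda', \delta', \gamma')$, where $\lambda' = \lambda - \frac{1}{2}$,
$\delta'^2 = t\big(\mu - \frac{1}{t}\log\frac{S_t}{S_0}\big)^2 + \delta^2$, and $\gamma'^2 = \frac{t}{4} + \gamma^2$.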
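
The full conditional posterior distribution for $\mu$ given $\sigma^2$ is given by
$\pi\big(\mu \mid \sigma^2, \log\frac{S_t}{S_0}\big)$ and is proportional to:

$$\propto \pi(\mu) \exp\Big\{-\frac{t}{2\sigma^2}\Big[\mu - \Big(\frac{1}{t}\log\frac{S_t}{S_0} + \frac{\sigma^2}{2}\Big)\Big]^2\Big\}$$

$$\propto \exp\Big\{-\frac{(\mu - m)^2}{2s^2} - \frac{t}{2\sigma^2}\Big[\mu^2 - 2\mu\Big(\frac{1}{t}\log\frac{S_t}{S_0} + \frac{\sigma^2}{2}\Big)\Big]\Big\}$$

$$\propto \exp\Big\{-\frac{ts^2 + \sigma^2}{2\sigma^2 s^2}\Big[\mu^2 - 2\mu\,\frac{m\sigma^2 + ts^2\big(\frac{1}{t}\log\frac{S_t}{S_0} + \frac{\sigma^2}{2}\big)}{ts^2 + \sigma^2}\Big]\Big\}$$

where $\pi(\mu)$ is a $N(m, s^2)$. We conclude therefore that the conditional
posterior distribution for $\mu$ is distributed

$$N\Big(\frac{m\sigma^2 + ts^2\big(\frac{1}{t}\log\frac{S_t}{S_0} + \frac{\sigma^2}{2}\big)}{ts^2 + \sigma^2},\ \frac{\sigma^2 s^2}{ts^2 + \sigma^2}\Big),$$

which is the same conditional posterior distribution as in Polson and Roberts (1994).

□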



B.1 Consistency of posterior distributions for $\mu$ and $\sigma^2$

We now prove consistency of the posterior distributions for both $\mu$ and $\sigma^2$ in
the sense of definition 1. The posterior variance of $\mu$ is $\frac{\sigma^2}{t}$ and goes to 0
as $t \to \infty$, which together with Chebyshev's inequality shows consistency.

The posterior variance of $\sigma^2$ given the data and $\mu$ is given by Corollary
1.1 from Silva et al (2006):

$$\mathrm{Var}(\sigma^2 \mid \lambda, \delta, \gamma) = \Big(\frac{\delta}{\gamma}\Big)^2 \Big[\frac{K_{\lambda+2}(\gamma\delta)}{K_\lambda(\gamma\delta)} - \Big(\frac{K_{\lambda+1}(\gamma\delta)}{K_\lambda(\gamma\delta)}\Big)^2\Big]$$

where $K_\nu(\omega)$ is the modified Bessel function of the third kind and equal to
$K_\nu(\omega) = \frac{1}{2}\int_0^\infty x^{\nu - 1} \exp\big\{-\frac{1}{2}\omega\big(x + \frac{1}{x}\big)\big\}\,dx$. By noticing that
$\lim_{\omega \to \infty} K_\nu(\omega) = 0$, that $K_\nu(\omega) \sim \sqrt{\pi/2}\;\omega^{-1/2} \exp(-\omega)$ (for $\omega$ very big), and
that $K_\nu(\omega) > 0$, we can bound the posterior variance and obtain

$$\mathrm{Var}(\sigma^2 \mid \lambda, \delta, \gamma) \to 0 \quad \text{as } \omega \to \infty.$$

This last result shows that the posterior variance of $\sigma^2$ goes to zero as
$\omega \to \infty$. Combining this last fact together with Chebyshev's inequality
shows consistency of the posterior probability distribution of both $\sigma^2$ and $\mu$.



B.2 Application of lemmas 1 and 2 for Gibbs sampling 

Given the full conditional distributions for both $\mu$ and $\sigma^2$ from lemmas 1 and
2, we can use them to construct a Gibbs sampler in order to get the posterior
probability distributions for both $\mu$ and $\sigma^2$ in the Black & Scholes model.
We now present the Gibbs algorithm:

• Initialize both $\mu^0$ and $\sigma^{2,0}$ with starting values

• Draw $\mu^i \sim N\big(\frac{\sigma^{2,i-1}}{2} + \frac{1}{t}\log\frac{S_t}{S_0},\ \frac{\sigma^{2,i-1}}{t}\big)$

• Draw $\sigma^{2,i} \sim GIG(\lambda', \delta', \gamma')$, where the parameters $\lambda'$, $\delta'$, and $\gamma'$ are
given by the above lemmas

• Repeat for $i = 1, \dots, I$ where $I$ is big enough.

The posterior for $\mu$ given $\sigma^2$ converges to its posterior mean since the
posterior variance for both $\mu$ and $\sigma^2$ goes to 0 as $t \to \infty$ from subsection B.1.
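
A minimal sketch of such a Gibbs sampler is given below, using only the Python standard library. It is written for the proper priors of Lemma 2 rather than the flat priors of Lemma 1, and it makes one simplifying assumption that is ours, not the paper's: the $GIG(\lambda, \delta, \gamma)$ prior on $\sigma^2$ has $\lambda = 0$, so that the conditional posterior has $\lambda' = -1/2$ and is an inverse Gaussian with mean $\delta'/\gamma'$ and shape $\delta'^2$, which can be sampled with the Michael-Schucany-Haas method. All hyperparameter values are hypothetical.

```python
import math
import random

def sample_inverse_gaussian(mean, shape, rng):
    """Michael-Schucany-Haas sampler for the inverse Gaussian IG(mean, shape)."""
    y = rng.gauss(0.0, 1.0) ** 2
    x = (mean + mean * mean * y / (2 * shape)
         - (mean / (2 * shape)) * math.sqrt(4 * mean * shape * y + mean * mean * y * y))
    return x if rng.random() <= mean / (mean + x) else mean * mean / x

def gibbs_black_scholes(log_ret, t, m, s, delta, gamma, n_iter, seed=0):
    """Gibbs sampler for (mu, sigma^2) given log(S_t / S_0) = log_ret.

    Priors (assumptions for this sketch): mu ~ N(m, s^2) and
    sigma^2 ~ GIG(0, delta, gamma), so sigma^2 | mu is GIG(-1/2, delta', gamma').
    """
    rng = random.Random(seed)
    sigma2 = delta / gamma          # starting value for sigma^2
    draws = []
    for _ in range(n_iter):
        # mu | sigma^2: normal conditional from Lemma 2.
        centre = log_ret / t + 0.5 * sigma2
        denom = t * s * s + sigma2
        mu = rng.gauss((m * sigma2 + t * s * s * centre) / denom,
                       math.sqrt(sigma2 * s * s / denom))
        # sigma^2 | mu: GIG(-1/2, delta', gamma') = IG(delta'/gamma', delta'^2).
        delta_post = math.sqrt(t * (mu - log_ret / t) ** 2 + delta ** 2)
        gamma_post = math.sqrt(t / 4 + gamma ** 2)
        sigma2 = sample_inverse_gaussian(delta_post / gamma_post, delta_post ** 2, rng)
        draws.append((mu, sigma2))
    return draws

# Hypothetical data and hyperparameters: one year, log return of 8%.
draws = gibbs_black_scholes(log_ret=0.08, t=1.0, m=0.05, s=0.1,
                            delta=0.2, gamma=5.0, n_iter=2000, seed=1)
```

For a general prior $\lambda$ the $\sigma^2$ step needs a full GIG sampler instead of the inverse Gaussian shortcut. Proper priors are used here because, with a single observed log return and flat priors on both parameters, the joint posterior is not integrable, so the chain would have no stationary distribution to converge to.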